This markdown document illustrates the concepts, principles, and methods used in CAST, a Probabilistic Decline Curve Analysis automation script and user interface. The CAST program is coded in R, and the user interface is built in Spotfire to take advantage of its user-friendly visualizations and code integration. This document explains what the CAST code is doing in the background, to better understand the outputs that can be viewed in Spotfire.
Some background information and principles used throughout this document are outlined below. Readers already familiar with these concepts can skip ahead to the next section, where the illustrations begin.

Decline curve analysis fits ARPS models (exponential, hyperbolic, and modified hyperbolic declines) to historical production data to forecast future rates; these are the models we will evaluate in this document. Probabilistic Decline Curve Analysis (PDCA) extends this by producing a distribution of forecasts rather than a single deterministic curve, which is why we would like the PDCA methodology integrated into our workflows: it quantifies the uncertainty in each forecast. The question is how to execute PDCA in practice.

Since we only observe one production history per well, we cannot build a distribution of forecasts directly from observed forecasts. Bayesian theory plays its part here: by randomly sampling parameter sets from a prior distribution and evaluating them against the observed data, we can estimate the posterior distribution of the model parameters, and therefore of the forecasts.
To illustrate the CAST methods, let’s look at a sample public dataset of a few wells in Alberta. The sample data contains both oil and gas rates for each producing month, taken from the public domain. Let’s look at the rate-time plots for both streams of the sample data.
Looking at the production profiles for each sample well, we can summarize key parameters for each well. Months on production, initial oil and gas rates, peak oil and gas rates, peak month and date information, and cumulative volumes for each phase are all important parameters used in creating an ARPS decline. Let’s summarize these parameters for the oil stream.
| UWI | OnProd Date | Initial Oil Rate | Peak Oil Rate | Peak Month | Cum Oil |
|---|---|---|---|---|---|
| 100/03-17-081-16W6/00 | 2018-04-01 | 11.951 | 232.730 | 3 | 905924.2 |
| 100/05-17-081-16W6/00 | 2018-04-01 | 26.418 | 328.338 | 3 | 1839089.1 |
| 100/11-17-081-16W6/00 | 2018-05-01 | 218.892 | 544.714 | 2 | 2728403.2 |
| 100/12-17-081-16W6/00 | 2018-05-01 | 265.438 | 538.424 | 2 | 2338572.9 |
| 102/05-17-081-16W6/00 | 2018-04-01 | 6.919 | 406.334 | 3 | 1134906.0 |
| 102/12-17-081-16W6/00 | 2018-05-01 | 135.235 | 439.671 | 2 | 1704462.9 |
| 103/05-17-081-16W6/00 | 2018-04-01 | 7.548 | 378.658 | 3 | 1660988.3 |
These summary parameters will be revisited in the extra learning functionality of CAST to prioritize which matches will be completed first.
Let’s define the modified hyperbolic equation in executable form. The equation will accept decline parameters and a time series across which to evaluate production rates. First we’ll need a function to convert secant decline percentage to nominal percentage, then we will define the modified hyperbolic function as “MH”.
## to.nominal function converts effective (secant) decline rates to nominal decline rates
to.nominal <- function(De, b = 0) {
  if (b != 0) {
    return((((1 - De)^(-b)) - 1) / b)
  } else {
    return(-log1p(-De))
  }
}
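As a quick sanity check of the conversion (the values below are illustrative and not taken from the sample wells), a 70% effective secant decline with b = 1.5 converts to a much larger nominal rate, while at b = 0 the conversion reduces to the familiar -ln(1 - De):

```r
# Illustrative conversions (not from the sample dataset)
to.nominal(0.7, b = 1.5)  # hyperbolic case: ((1 - 0.7)^(-1.5) - 1) / 1.5, roughly 3.39
to.nominal(0.07)          # exponential case: -log(1 - 0.07), roughly 0.0726
```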
## MH function is the modified hyperbolic function, where a single b value transitions
## to an exponential decline at a limiting decline rate
# The function accepts the equation constants and elapsed time values in days;
# the output is the rate at each given time for the given parameters
MH <- function(qi, b = 1.5, Di = 0.7, Dlim = 0.07, q.abd = 3, t) {
  # qi must be entered as a rate
  # Di must be a decimal in effective secant format as % per year, less than 1
  # t must be entered as days
  # Convert Di and Dlim to nominal
  ai <- to.nominal(Di, b)
  alim <- to.nominal(Dlim)
  # Convert t to yearly time fraction
  t <- t / 365.25
  if (b < 1e-10) {
    # True exponential decline
    q <- qi * exp(-ai * t)
  } else {
    # Error handler: the initial parameters must not bypass the exponential tail (i.e. Di > Dlim)
    if (ai <= alim) stop("Initial nominal decline must be greater than the limiting decline")
    # Calculate limiting decline rate
    qlim <- qi * (alim / ai)^(1 / b)
    # Calculate limiting decline time, in years
    tlim <- (((qi / qlim)^b) - 1) / (b * ai)
    # Calculate q assuming hyperbolic
    q <- qi / ((1 + b * ai * t)^(1 / b))
    # Replace q with the exponential tail when t > tlim
    q[t > tlim] <- qlim * exp(-alim * (t[t > tlim] - tlim))
  }
  # Abandonment cutoff: when q < q.abd, set q to a negligible rate
  q[q < q.abd] <- 1e-10
  return(q)
}
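As a brief usage sketch (the parameter values here are illustrative only), the function returns the starting rate at t = 0 and declining rates thereafter:

```r
# Evaluate an illustrative decline at 0, 1, and 2 years (times entered in days)
MH(qi = 500, b = 1.2, Di = 0.8, Dlim = 0.07, q.abd = 3, t = c(0, 365.25, 730.5))
# The first value returned is qi itself, since (1 + b * ai * 0)^(-1/b) = 1
```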
Now that we’ve defined the executable forms of the models we need, how can we automate the fitting of an ARPS curve to a sample production dataset? A methodology for this usually consists of taking a range of input values and evaluating potential matches until the match with the least error is found. To accomplish this we can use a Monte Carlo sampling method to draw 10,000 parameter sets from the input distribution space and evaluate each set against the actual production data. Using a Monte Carlo method will not only illustrate the error functions we will use, but will also serve as the initialization phase for the CAST probabilistic process.
Focusing on the Modified Hyperbolic equation for oil, let’s first define an input parameter space within which the solution should exist. To keep it simple, we’ll use uniform distributions for each parameter, contained in a DefineDistribution class object.
# Define distribution constraints for our Monte Carlo sampling of arps parameters
# Only oil values will be defined here.
## bi values
o.bi_min <- 0.5 #Initial oil b value min value
o.bi_max <- 2 #Initial oil b value max value
## di values; as decimals in secant. Will be converted to nominal in arps equation.
o.di_min <- 0.4 #Initial oil decline percentage min
o.di_max <- 0.98 #Initial oil decline percentage max
## dlim values; as decimals in secant.
o.dlim <- 0.07 #Limiting decline percentage used in modified hyperbolic equation
# Abandonment value for oil; as rates in production units
o.abd <- 3 # in bbls/d
# To handle multiple distribution types, distributions will be contained in
# a "DefineDistribution" class. While not a formal class, it will contain structured
# distribution information in a list, namely:
# Type = Normal, Lognormal, or Uniform; min_mean = min bound for uniform or mean of lognormal
# or normal; max_sd = max bound for uniform or standard deviation for lognormal or normal
# Define each parameter distribution
b <- list(type = "Uniform", min_mean = o.bi_min, max_sd = o.bi_max)
di <- list(type = "Uniform", min_mean = o.di_min, max_sd = o.di_max)
dlim <- list(type = "Uniform", min_mean = o.dlim, max_sd = o.dlim)
# Combine all input distributions into a list for organization
oil.inputs <- list("b" = b, "di" = di, "dlim" = dlim)
Now the only parameter that is missing from the solution space is qi. This is obviously dependent on the well we are looking to match, so let’s take the 100/05-17-081-16W6/00 well from the sample dataset as our example well. We will define the qi solution space a little differently, using a uniform distribution bounded by peak rate multipliers to keep the samples within reason.
# Pull the peak oil rate of the sample well (row 2 of the summary table, stored in `matrix`)
o.qpeak <- matrix[2, "Oil.Peak.Rate"]
# Then define the qi input distribution. The upper bound will be 3 x peak rate, while the lower
# bound will be ~1/3 x peak rate
qi <- list(type = "Uniform", min_mean = as.numeric(0.3 * o.qpeak),
max_sd = as.numeric(3 * o.qpeak))
# Now append the qi input to the oil.inputs list
oil.inputs <- append(oil.inputs, list("qi" = qi))

To understand the input distributions, let’s visualize them. We’ll randomly sample 10,000 iterations from each DefineDistribution class object and plot the results of each parameter input.
Uniform Distributions of b, di and qi
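These draws, and the MHSample function below, rely on a sampling helper called sample.any that is not shown in this document. A minimal sketch consistent with the DefineDistribution description might look like the following (interpreting min_mean and max_sd per the class definition; treating the lognormal parameters as log-scale values is an assumption):

```r
# Draw n samples from a DefineDistribution-style list (sketch; the actual CAST code may differ)
sample.any <- function(n, dist) {
  switch(dist$type,
         "Uniform"   = runif(n, min = dist$min_mean, max = dist$max_sd),
         "Normal"    = rnorm(n, mean = dist$min_mean, sd = dist$max_sd),
         "Lognormal" = rlnorm(n, meanlog = dist$min_mean, sdlog = dist$max_sd),
         stop("Unknown distribution type"))
}
```

Note that for the constant dlim input (min = max = 0.07), runif simply returns 0.07 for every sample.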
With the solution space defined, we now need to compare each sample set to the actual production to find the best fit. We’ll need to synthesize production rates for each sample set by substituting them into the Modified Hyperbolic equation over a given time period.
Let’s combine and organize our sampled parameters in a data frame. We’ll use an equation-specific function to ensure completeness and repeatability for multi-well matching.
# The MHSample function samples and combines all parameters based on their
# DefineDistribution classes. A future "method" input may return different output
# types to speed up the MCMC calculations; it can be ignored for now.
MHSample <- function(n = 10000, qi, b, di, dlim) {
  # Sample the parameters relevant to the MH function: qi, b, di, dlim.
  # Each input is a DefineDistribution-class list created prior to the function call,
  # containing the distribution information for that parameter.
  # The function outputs a Monte Carlo data frame with n x m dimensions, where n is the
  # sample count and m is the number of parameters needed for the MH equation.
  # Step through each parameter and call the sample.any function
  qi.out <- sample.any(n, qi)
  b.out <- sample.any(n, b)
  di.out <- sample.any(n, di)
  dlim.out <- sample.any(n, dlim)
  # Combine each returned column into a data frame for output
  sampledParams <- data.frame("qi" = qi.out, "b" = b.out, "di" = di.out, "dlim" = dlim.out)
  return(sampledParams)
}
# Pass the previously defined distributions into the MHSample function.
# Note this repeats our earlier sampling in a cleaner form.
MH.sampled.parameters <- MHSample(n = 10000, qi = qi, b = b, di = di, dlim = dlim)
# Return in a table, displaying the first 10 rows only
kable(MH.sampled.parameters[1:10, ], align = 'c') %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                full_width = F, position = "center")

| qi | b | di | dlim |
|---|---|---|---|
| 637.8042 | 1.4575283 | 0.7250933 | 0.07 |
| 849.2141 | 1.2513042 | 0.6510075 | 0.07 |
| 279.0781 | 0.7729207 | 0.4385910 | 0.07 |
| 232.6056 | 1.9450797 | 0.6986293 | 0.07 |
| 656.3749 | 1.1937775 | 0.8499618 | 0.07 |
| 280.2020 | 1.6933190 | 0.5598918 | 0.07 |
| 707.6255 | 1.3157287 | 0.4501225 | 0.07 |
| 564.1357 | 0.7495648 | 0.4152572 | 0.07 |
| 835.2198 | 1.7953317 | 0.9688878 | 0.07 |
| 298.7670 | 0.8256256 | 0.4571640 | 0.07 |
Using the combined parameter dataframe and the modified hyperbolic equation, we can define a function to use each row of the parameter dataframe as a Monte Carlo iteration and substitute it into the ARPS equation across the time interval for the sample well. The function will create and return an mcIteration object for each iteration. This object is a list with slots for Iteration name, Parameters, and Synth. Some additional slots will be created and populated later as we approach cost and forecasting.
# MH.synthesize carries out rate synthesis across the actual time interval
MH.synthesize <- function(parameters, q.abd, t) {
  # Parse the parameters and pass them into the MH function call
  fc <- MH(parameters[["qi"]], parameters[["b"]], parameters[["di"]],
           parameters[["dlim"]], q.abd, t)
  # Simplify the parameters into a single string representation
  parametersName <- paste(names(parameters), round(parameters, digits = 2),
                          sep = "=", collapse = " ")
  # Create an mcIteration imitation list to improve performance and ease of combining data.
  # Return the mcIteration; remaining list items will be appended by later functions.
  return(list("Iteration" = parametersName, "Parameters" = parameters, "Synth" = fc))
}

To properly synthesize production data for the sample well, we need to create a time vector from the peak oil rate to the last producing month.
# Define a time interval of actuals to synthesize across, using 100/05-17-081-16W6/00
uwi <- "100/05-17-081-16W6/00"
# Subset production to include only this UWI
prod <- formated_production$CD[which(formated_production$CD$Unique.Well.ID == uwi), ]
# Find the peak oil month for this UWI
uwi.pm <- matrix[["Oil.Peak.Month"]][which(matrix$eachUWI == uwi)]
# Refine the production table to only include data from the peak oil rate onward
prod_ap <- prod[-c(1:(uwi.pm - 1)), ]
# Find months on production
mop <- sorted_matrix[which(sorted_matrix$eachUWI == uwi), "Months.on.Prod"]
# Create a time vector starting at 0 for the peak oil month, using a cumulative sum of days
synth.time <- cumsum(c(0, head(prod_ap$Days.In.Month, -1)))
# Apply the MH.synthesize function in an apply call to execute across each row
# of the parameters dataframe
oil.synth <- apply(MH.sampled.parameters, 1, MH.synthesize, q.abd = o.abd, t = synth.time)
# Print the first object of oil.synth to show the mcIteration object
print(oil.synth[[1]])

## $Iteration
## [1] "qi=637.8 b=1.46 di=0.73 dlim=0.07"
##
## $Parameters
## qi b di dlim
## 637.8041908 1.4575283 0.7250933 0.0700000
##
## $Synth
## [1] 637.8042 492.5824 406.2513 349.5694 310.1796 279.2837 255.6979 235.8914
## [9] 219.4737 206.8542 194.7880 184.6325 175.4065 167.4857 160.1658 153.5850
## [17] 147.8145 142.3815 137.5680 132.9939 128.7763 125.1160 121.4741 118.1879
## [25] 115.0127
Each mcIteration list, defining a Monte Carlo iteration and historical synthesis, can now be evaluated against the actual data to search for an appropriate model.
To compare each synthesis to the historical production data observed, we need to use functions that compare predicted points to actuals for each model. These functions are commonly referred to as loss functions. The most common loss function is the L2 loss function, which is popular due to its easy implementation, especially in linear models. However, the L2 loss is not as useful when comparing data with significant outliers, such as oil and gas production data, because it assigns too much weight to these outliers. We will need to look at more robust loss functions to find models that match the historical data as an engineer would.
Soft L1 loss is a robust, continuous variation of the common Huber loss that gives less weight to outliers than an L2 loss, making it useful for fitting time-series production data. Our Soft L1 definition will return the individual loss of each point and an average cost for all residuals input.
soft_L1 <- function(r) {
  # Individual loss of each residual
  IndvCost <- 2 * ((1 + abs(r))^0.5 - 1)
  # Average cost across all residuals
  Cost <- sum(IndvCost, na.rm = TRUE) / length(IndvCost)
  return(list("IndvCost" = IndvCost, "Cost" = Cost))
}
To apply the loss equation we’ll define a function to apply the Soft L1 loss to the residuals of each model match. Due to the non-linear nature of production data, the loss function will be applied to the log residuals. The function will populate the IndvCost and Cost slots of each mcIteration object. For this illustration, the function will be defined in the background and we can see the result for the first iteration object.
## $Iteration
## [1] "qi=637.8 b=1.46 di=0.73 dlim=0.07"
##
## $Parameters
## qi b di dlim
## 637.8041908 1.4575283 0.7250933 0.0700000
##
## $Synth
## [1] 637.8042 492.5824 406.2513 349.5694 310.1796 279.2837 255.6979 235.8914
## [9] 219.4737 206.8542 194.7880 184.6325 175.4065 167.4857 160.1658 153.5850
## [17] 147.8145 142.3815 137.5680 132.9939 128.7763 125.1160 121.4741 118.1879
## [25] 115.0127
##
## $IndvCost
## [1] 0.5799130 0.6213472 0.7045805 0.7993099 0.8145477 0.8342588 0.7150210
## [8] 0.8231342 0.8641477 0.9012344 0.9399487 1.0331270 0.9826236 1.0292325
## [15] 1.0180480 1.0188545 1.0867748 1.0848554 1.0857151 1.0637195 1.1046275
## [22] 1.1677599 1.1636212 1.2077019 1.1906771
##
## $Cost
## [1] 0.9533912
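The background function that populates these slots is not shown in this document. A minimal sketch consistent with the description (log residuals passed to the loss function; the helper name MH.evaluate is hypothetical) might be:

```r
# Populate the IndvCost and Cost slots of an mcIteration (sketch; actual CAST code may differ)
MH.evaluate <- function(iter, actual, loss = soft_L1) {
  # Log residuals between the actual rates and the synthesized rates
  r <- log(actual) - log(iter$Synth)
  cost <- loss(r)
  iter$IndvCost <- cost[[1]]
  iter$Cost <- cost[[2]]
  return(iter)
}
```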
Using the cost function applied to each iteration, we can minimize the cost to find our bestfit model. Using this bestfit, we can also visualize the cost model results for each production point in the fit.
Using the log residuals and the Soft L1 cost function, let’s find the top 100 models from our initial 10,000 iterations and see how the matches vary from the bestfit. We’ll colour them by average cost to compare the forecasts.
Top 1% of Models Using Soft L1 Loss
We can see that the top 100 forecasts do a good job of covering the range of actual oil rates, with the lowest cost forecasts tightly grouped around the actuals. However, one observation is that the forecasts are not evenly distributed above and below the actuals to appropriately capture uncertainty in the forecasts. Let’s explore another loss function to see if we can improve this behaviour.
The Soft L1 loss function does a decent job of finding a best fit model, but is there a better loss function that can be used? Let’s look at a loss function defined in SPE-174784-PA by Fulford et al. that takes a similar approach, but is designed to capture more uncertainty in the production data. This loss function uses the average and standard deviation of the individual cost values across the fit, comparing them to a baseline fit. This will prove particularly useful when we move into the PDCA methods later in the CAST program.
For now, let’s define this loss function as PDCA Loss and see how it improves the fits from our Monte Carlo.
# As with our Soft L1, pdca_loss will return slots for IndvCost and Cost
pdca_loss <- function(r, Eps_mean_min = 0, Eps_std_min = 0) {
  IndvCost <- r^2  # PDCA loss simply uses the L2 loss for the individual cost
  Eps_mean <- mean(r, na.rm = TRUE)  # Mean of the residuals
  Eps_std <- sd(r, na.rm = TRUE)     # Standard deviation of the residuals
  Cost <- ((Eps_mean - Eps_mean_min) / 0.1)^2 + ((Eps_std - Eps_std_min) / 0.01)^2
  return(list("IndvCost" = IndvCost, "Cost" = Cost))
}

Just as we did with the Soft L1, let’s use the PDCA loss function to find the top 100 forecasts from the same 10,000 iterations. We’ll set the minimum mean and minimum standard deviation of epsilon to 0, assuming a perfect match. We’ll colour by average cost and compare to the Soft L1 method.
Comparing Top 1% of Models Using Soft L1 and PDCA Loss
We can see that the PDCA Loss function has improved the distribution of fits above and below the actuals, a behaviour that we would like to carry forward because it helps to describe a high and low fit, as well as a bestfit. Lastly, let’s look at the parameter distributions for the top 1% of models to get an idea of the solution space using the PDCA Loss method.
Parameters of P1 model matches using PDCA Loss method
Observing these distributions, the accepted parameters are mostly lognormal in shape. With only 100 accepted samples, however, they are sparsely defined, which suggests we will need to step into MCMC to build a proper PDCA distribution going forward. Keep these distributions in mind for comparison later on.

The two charts above also show that we will carry the PDCA loss forward through the rest of the evaluation, as it provides a better probabilistic representation of the fits.
Using Monte Carlo methods and loss functions to automate the DCA matching based on historicals, we can now take the top 100 models to the next step: forecasting. For each of the accepted models, we will synthesize monthly production rates for the remaining life of the sample well until an abandonment rate of 3 bbls/d is reached.
Visualizing each forecast we can see the distribution appears to summarize our actual data quite well. In a way, we’ve taken the first step towards PDCA. We’ve used a variety of accepted models to define possible oil rate outcomes for the sample well. To illustrate this better, let’s compile the forecasts into P10, Mean, P50, and P90 curves to represent all the forecasts.
The most obvious way to aggregate the individual forecasts would be to simply find the percentile rates at each monthly time interval that represent the P10, Mean, P50 and P90. While this works well in theory, in practice we run into an issue where these aggregations have no single ARPs definition and therefore cannot be used outside of this space. Another approach to consider is using EURs from each of these forecasts to create a probabilistic bin within which we can find the average ARPS equation. Since the correlation between 30 year EUR and ARPs parameters is fairly linear, this method works well. As a proof we will illustrate both methods side-by-side.
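The per-timestep percentile aggregation described above can be sketched in a few lines (fc.matrix is a hypothetical name for a matrix holding one accepted forecast per row and one monthly timestep per column):

```r
# Percentile rates at each monthly time interval (sketch; object names are illustrative)
agg.curves <- apply(fc.matrix, 2, quantile, probs = c(0.1, 0.5, 0.9), na.rm = TRUE)
mean.curve <- colMeans(fc.matrix, na.rm = TRUE)
# Note: depending on convention, the P10 (high) forecast corresponds to the
# 90th percentile of rates, so the probs mapping may need to be reversed
```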
Comparison of Aggregation Methods
Both charts show similar results. We can see the P10, P50, and P90 forecasts in dashed lines, overlaid with the mean in solid black. The EUR bin method - which takes a +/- 0.5 percentile interval around the P10, Mean, P50, and P90 - looks to match the actual percentile aggregation well. This illustrates a small sample PDCA output, where we have summarized the range of possible outcomes into percentile forecasts. This has worked well using the Monte Carlo method due to the sample well having clean production data, but this will not always be the case. We will need to investigate other methods to better incorporate a relative view on uncertainty.
Now that we’ve dipped our toes into PDCA, we might as well dive straight in. Revisiting some of the concepts illustrated earlier in this markdown, we’ll see how taking our probabilistic methodology to the next level can transform the Monte Carlo approximation of PDCA into a robust model that can cope with poor initial fits.
To achieve the desired PDCA outcome that covers the range of uncertainty for all varieties of production data, we will need to incorporate a Markov Chain Monte Carlo (MCMC) method. This will allow us to recycle many of the principles we’ve previously applied in the Monte Carlo method above, but add important functionality to further refine the resulting probabilistic forecasts. As part of the MCMC method, once we reach a bestfit approximation of the production data in the burn-in phase (here satisfied by our Monte Carlo minimization), the algorithm will then jump through the solution space. Each jump will be evaluated against the last using the PDCA Loss function defined above to determine whether to accept or reject the jump. If accepted, the jump is added to the view of the posterior distribution and the next step initiates from it. If rejected, a new jump is sampled and evaluated. This is repeated for 10,000 iterations until the posterior distribution is defined. Using the same sample well, we can illustrate the MCMC method as used in the CAST program.
The burn-in process is an initial convergence step that allows the MCMC to start from an equilibrium position, in this case the bestfit from the 10,000 Monte Carlo runs. While the burn-in process is often described as unnecessary or biased, in the case of PDCA we would like to maintain its functionality to give us a baseline fit for evaluation of the PDCA loss function. In plain words, there is more uncertainty in production data that is hard to fit a curve to than in data that is easy to fit. In this way, we can honour that higher level of uncertainty when needed.
With the bestfit defined as the starting point, we can begin sampling through the solution space to carry out the MCMC process. The Metropolis-Hastings algorithm is a method for obtaining a sequence of random samples from multi-dimensional distributions like we have in PDCA. Simply defined, the Metropolis-Hastings method starts at an arbitrary multi-dimensional point (the start of the Markov chain) and randomly chooses the next multi-dimensional point to evaluate using jumping distributions for each dimension. In our case the input parameters of the MH ARPS equation make up each dimension. For each iteration, the candidate sample is compared to the previous sample using an acceptance ratio. If this ratio is greater than 1, the jump is automatically accepted as being more probable than the previous sample. If the ratio is less than 1, the point is less probable than the previous sample, but can still be accepted: using a random probability, sometimes the move will be accepted and sometimes it will be rejected, with acceptance ratios closer to 1 being given a higher probability of acceptance. Thus, we will tend to stay in high-density regions and define the desired distribution after sufficient iterations.
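The accept/reject step described above can be sketched as follows. This is a minimal illustration rather than the CAST implementation: the acceptance ratio here is formed from the PDCA loss costs by treating lower cost as higher probability (the exp(-Cost) weighting is an assumption), and the helper name mh_step is hypothetical:

```r
# One Metropolis-Hastings accept/reject step (sketch; assumes probability ~ exp(-Cost))
mh_step <- function(current, candidate) {
  ratio <- exp(current$Cost - candidate$Cost)  # > 1 when the candidate has lower cost
  if (ratio >= 1 || runif(1) < ratio) {
    candidate  # accept the jump and continue the chain from it
  } else {
    current    # reject the jump and stay at the current sample
  }
}
```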
The jumping distributions we will use are normal distributions centred on the previously accepted value of each parameter: each candidate is drawn as N(previous value, SD), where the standard deviation is a tuned step size for that parameter.
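Drawing a candidate jump under these distributions can be sketched as follows (the helper name propose and the sds values are hypothetical; dlim is held at its fixed value of 0.07):

```r
# Propose a candidate parameter set centred on the previous accepted sample (sketch)
propose <- function(prev, sds) {
  list(qi   = rnorm(1, mean = prev$qi, sd = sds$qi),
       b    = rnorm(1, mean = prev$b,  sd = sds$b),
       di   = rnorm(1, mean = prev$di, sd = sds$di),
       dlim = prev$dlim)  # limiting decline stays fixed
}
```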
Applying these jumping distributions for 10,000 iterations on our sample well, we can follow the chain of accepted or rejected iterations as the posterior distribution is defined.
Accepted input parameters using MCMC method
The resulting accepted distributions are dependent on the jumping distributions, which need to be tuned appropriately, but we can see the posterior distributions take shape. With more samples than the Monte Carlo illustration, each distribution is well defined, following normal or lognormal distributions.
Note that the sampled parameters must stay within the proper solution space (the prior distributions) throughout the entire MCMC process; this is crucial for the resulting equations to make physical sense.
Let’s compare the top Monte Carlo matches to the MCMC matches to see how the accepted solution space matches the oil rate behaviour.
Comparing Top 1% Monte Carlo Matches to MCMC Matches
We can see that the matches have a tighter spread around the actuals, returning more forecasts within the acceptable solution space. The MCMC method with the Metropolis-Hastings algorithm has achieved the desired result, returning thousands of acceptable models that define the uncertainty in forecasting the sample well. Using the same methods derived earlier in the markdown, let’s compile the accepted forecasts into P10, Mean, P50, and P90 curves to summarize the range of outcomes.
Finally, let’s summarize the EUR distribution and the EURs represented by the probabilistic forecasts.
| | qi | b | di | dlim | eur |
|---|---|---|---|---|---|
| P10 | 347.5136 | 0.9146386 | 0.8459261 | 0.07 | 41339.62 |
| Mean | 302.8022 | 0.8440234 | 0.8419841 | 0.07 | 29471.51 |
| P50 | 295.1474 | 0.8196260 | 0.8366873 | 0.07 | 28128.93 |
| P90 | 267.4491 | 0.7040603 | 0.8292578 | 0.07 | 19216.47 |
The next section shows the extra outputs the CAST program gives: how it fits distributions to the accepted parameters using fitdistrplus, outputs the ARPS parameters for the PDCA matches, and exports their rate-time data for plotting within the program. From there, a section titled CAST Extras covers what makes some of the CAST algorithms unique: kNN automatic outlier detection, different model matches, learning, and enhanced peak detection.

Sub-headings: 1. Outputs; 2. Extras (2.1 Outlier detection, 2.2 Learning, etc.)
Darcy Redpath
Petronas Canada
dredpath@petronascanada.com